Surprises in approximating Levenshtein distances

Related articles

Approximating Nearest Neighbor Distances

Several researchers proposed using non-Euclidean metrics on point sets in Euclidean space for clustering noisy data. Almost always, a distance function is desired that recognizes the closeness of the points in the same cluster, even if the Euclidean cluster diameter is large. Therefore, it is preferred to assign smaller costs to the paths that stay close to the input points. In this paper, we c...

Obliviously Approximating Sequence Distances

There are several applications for schemes which approximately find the distance between two sequences in a way that is 'oblivious' of one of the sequences up until a final sublinear number of comparisons. This paper shows how sequences can be preprocessed obliviously to give a binary string, so that a simple vector distance between two bitstrings gives an approximation to a sequence distance of inte...

Levenshtein Distances Fail to Identify Language Relationships Accurately

The Levenshtein distance is a simple distance metric derived from the number of edit operations needed to transform one string into another. This metric has received recent attention as a means of automatically classifying languages into genealogical subgroups. In this article I test the performance of the Levenshtein distance for classifying languages by subsampling three language subsets from...
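As an illustration of the edit-operation definition in the abstract above, here is a minimal sketch of the standard dynamic-programming computation. It assumes unit cost for each insertion, deletion, and substitution; the function name `levenshtein` is chosen for this example and is not taken from any of the listed papers.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn a into b (assumed unit cost each)."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string as b

    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))  # from "" to b[:j] takes j insertions
    for i, ca in enumerate(a, start=1):
        current = [i]  # from a[:i] to "" takes i deletions
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory grows with the shorter string while running time stays proportional to the product of the two lengths.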

Approximating Subtree Distances Between Phylogenies

We give a 5-approximation algorithm to the rooted Subtree-Prune-and-Regraft (rSPR) distance between two phylogenies, which was recently shown to be NP-complete. This paper presents the first approximation result for this important tree distance. The algorithm follows a standard format for tree distances. The novel ideas are in the analysis. In the analysis, the cost of the algorithm uses a "cas...

Generating a bilingual lexical corpus using interlanguage normalized Levenshtein distances

Finding large numbers of target items for phonetic and phonological experiments can be a time-consuming and error-prone task. Using freely available tools and data, we have generated a bilingual corpus with the specific aim of investigating the processing and perception of stress in second-language (L2) words. Normalized Levenshtein distances between orthographic and phonemic transcriptions of ...

Journal

Journal title: Journal of Theoretical Biology

Year: 2006

ISSN: 0022-5193

DOI: 10.1016/j.jtbi.2006.06.026